Abstract. We present a logic for the specification of static analysis 
problems that goes beyond the logics traditionally used. Its most promi- 
nent feature is the direct support for both inductive computations of 
behaviors as well as co-inductive specifications of properties. Two main 
theoretical contributions are a Moore Family result and a parametrized 
worst case time complexity result. We show that the logic and the associ- 
ated solver can be used for rapid prototyping and illustrate a wide variety 
of applications within Static Analysis, Constraint Satisfaction Problems 
and Model Checking. In all cases the complexity result specializes to the 
worst case time complexity of the classical methods. 



1 Introduction 

Static analysis [12,20] is a successful approach to the validation of properties of 
programming languages. It can be seen as a two-phase process where we first 
transform the analysis problem into a set of constraints that, in the second 
phase, is solved to produce the analysis result of interest. The constraints may 
be expressed in a language tailored to the problem at hand, or they may be 
expressed in a general purpose constraint language such as Datalog [1,5] or ALFP.

Model checking [13,2] is an automatic technique for verifying hardware and 
more recently software systems. Specifications are expressed in modal logic, 
whereas the system is modeled as a transition system or a Kripke structure. 
Given a system description the model checking algorithm either proves that the 
system satisfies the property, or reports a counterexample that violates it. 

Constraint Satisfaction Problems (CSPs) [18] are the subject of intense
research in both artificial intelligence and operations research. They consist of 
variables with constraints on them, and many real-world problems can be de- 
scribed as CSPs. A major challenge in constraint programming is to develop 
efficient generic approaches to solve instances of the CSP. 

In this paper we present a logic for specification of analysis problems that 
goes beyond the logics traditionally used. Its most prominent feature is the direct 
support for both inductive computations of behaviors as well as co-inductive 
specifications of properties. At the same time the approach taken falls within 
the Abstract Interpretation [9,8] framework, thus there always is a unique best 
solution to the analysis problem considered. We show that the logic and the 



associated solver can be used for rapid prototyping and illustrate a wide variety 
of applications within Static Analysis, Constraint Satisfaction Problems and 
Model Checking. 

One can notice a resemblance of the logic to modal $\mu$-calculus [16,13], which is 
extensively used in various areas of computer science such as computer-aided 
verification. Its defining feature is the addition of least and greatest fixpoint 
operators to modal logic; thus it achieves a great increase in expressive power, 
but at the same time an equally great increase in difficulty of understanding. 

The paper is organized as follows. In Section 2 we define the syntax and 
semantics of LFP. In Section 3 we establish a Moore Family result and estimate 
the worst case time complexity. In Section 4 we show an application of LFP to 
Static Analysis. We continue in Section 5 with an application to the Constraint 
Satisfaction Problem. An application to Model Checking is presented in Section 6. 
We conclude in Section 7. 

2 Syntax and Semantics 

In this section, we introduce Layered Fixed Point Logic (abbreviated LFP). 
The LFP formulae are made up of layers. Each layer can either be a define 
formula which corresponds to the inductive definition, or a constrain formula 
corresponding to the co-inductive specification. The following definition intro- 
duces the syntax of LFP. 

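Definition 1. Given a fixed countable set $\mathcal{X}$ of variables, a non-empty universe 
$\mathcal{U}$, a finite set of function symbols $\mathcal{F}$, and a finite alphabet $\mathcal{R}$ of predicate 
symbols, we define the set of LFP formulae, $cls$, together with clauses, $cl$, conditions, 
$cond$, constrains, $con$, definitions, $def$, and terms $u$ by the grammar: 

\[
\begin{array}{rcl}
u      & ::= & x \mid f(\vec{u}) \\
cond   & ::= & R(\vec{x}) \mid \neg R(\vec{x}) \mid cond_1 \wedge cond_2 \mid cond_1 \vee cond_2 \\
       & \mid & \exists x : cond \mid \forall x : cond \mid true \mid false \\
def    & ::= & cond \Rightarrow R(\vec{u}) \mid \forall x : def \mid def_1 \wedge def_2 \\
con    & ::= & R(\vec{u}) \Rightarrow cond \mid \forall x : con \mid con_1 \wedge con_2 \\
cl_i   & ::= & define(def) \mid constrain(con) \\
cls    & ::= & cl_1, \ldots, cl_s
\end{array}
\]

Here $x \in \mathcal{X}$, $R \in \mathcal{R}$, $f \in \mathcal{F}$ and $1 \le i \le s$. We say that $s$ is the order of the 
LFP formula $cl_1, \ldots, cl_s$. 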

We write $R(\vec{u})$ for $true \Rightarrow R(\vec{u})$ and $\neg R(\vec{u})$ for $R(\vec{u}) \Rightarrow false$, and we 
abbreviate zero-arity functions $f()$ as $f \in \mathcal{U}$. Occurrences of $R(\vec{x})$ and $\neg R(\vec{x})$ 
in conditions are called positive and negative queries, respectively. Occurrences 
of R{u) on the right hand side of the implication in define formulas are called 
defined occurrences. Occurrences of R{u) on the left hand side of the implication 
in constrain formulas are called constrained occurrences. Defined and constrained 
occurrences are jointly called assertions. 



In order to ensure desirable theoretical and pragmatic properties in the
presence of negation, we impose a notion of stratification similar to the one in
Datalog [1,5]. Intuitively, stratification ensures that a negative query is not performed 
until the predicate has been fully asserted (defined or constrained). This is im- 
portant for ensuring that once a condition evaluates to true it will continue to 
be true even after further assertions of predicates. 

Definition 2. The formula $cl_1, \ldots, cl_s$ is stratified if for all $i = 1, \ldots, s$ the 
following properties hold: 

- Relations asserted in $cl_i$ must not be asserted in $cl_{i+1}, \ldots, cl_s$ 

- Relations positively used in $cl_i$ must not be asserted in $cl_{i+1}, \ldots, cl_s$ 

- Relations negatively used in $cl_i$ must not be asserted in $cl_i, \ldots, cl_s$ 

The function $rank : \mathcal{R} \rightarrow \{0, \ldots, s\}$ is then uniquely defined as 

\[ rank(R) = \max(\{0\} \cup \{i \mid R \text{ is asserted in } cl_i\}) \]

Example 1. Using the notion of stratification we can define equality $eq$ and
non-equality $neq$ predicates as follows 

\[ define(\forall x : true \Rightarrow eq(x, x)),\quad define(\forall x : \forall y : \neg eq(x, y) \Rightarrow neq(x, y)) \]

According to Definition 2 the formula is stratified, since predicate $eq$ is negatively 
used only in the layer above the one that defines it. 
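
The conditions of Definition 2 are easy to check mechanically. The following Python sketch is only an illustration and is not part of the solver: it assumes a hypothetical encoding in which each layer is a dictionary listing the relations it asserts and the relations it queries positively (`pos`) and negatively (`neg`). The names `is_stratified` and `rank` are chosen here for exposition.

```python
from typing import Dict, List, Set

# Hypothetical encoding (an assumption of this sketch): a layer records the
# relations it asserts and the relations it queries positively / negatively.
Layer = Dict[str, Set[str]]


def is_stratified(layers: List[Layer]) -> bool:
    """Check the three conditions of Definition 2 for layers cl_1, ..., cl_s."""
    for i, layer in enumerate(layers):
        later: Set[str] = set()
        for other in layers[i + 1:]:
            later |= other["asserted"]
        # Asserted relations must not be asserted again in later layers.
        if layer["asserted"] & later:
            return False
        # Positively used relations must not be asserted in later layers.
        if layer["pos"] & later:
            return False
        # Negatively used relations must not be asserted in this or later layers.
        if layer["neg"] & (later | layer["asserted"]):
            return False
    return True


def rank(layers: List[Layer], relation: str) -> int:
    """rank(R) = max({0} U {i | R is asserted in cl_i}), layers numbered from 1."""
    return max([0] + [i for i, layer in enumerate(layers, start=1)
                      if relation in layer["asserted"]])


# Example 1: eq is defined in layer 1 and queried negatively in layer 2.
example_1 = [
    {"asserted": {"eq"}, "pos": set(), "neg": set()},
    {"asserted": {"neq"}, "pos": set(), "neg": {"eq"}},
]
# A single layer that negatively queries the relation it asserts.
unstratified = [{"asserted": {"p"}, "pos": set(), "neg": {"p"}}]
```

On this encoding Example 1 is accepted, while a layer that negatively queries a relation it asserts itself is rejected.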

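To specify the semantics of LFP we introduce the interpretations $\varrho$, $\zeta$ and $\varsigma$ 
of predicate symbols, function symbols and variables, respectively. Formally we 
have 

\[
\varrho : \prod_k \mathcal{R}/k \rightarrow \mathcal{P}(\mathcal{U}^k) \qquad
\zeta : \prod_k \mathcal{F}/k \rightarrow \mathcal{U}^k \rightarrow \mathcal{U} \qquad
\varsigma : \mathcal{X} \rightarrow \mathcal{U}
\]

In the above $\mathcal{R}/k$ stands for a set of predicate symbols of arity $k$, and $\mathcal{R}$ is a
disjoint union of the $\mathcal{R}/k$, hence $\mathcal{R} = \biguplus_k \mathcal{R}/k$. Similarly $\mathcal{F}/k$ is a set of function symbols 
of arity $k$ and $\mathcal{F} = \biguplus_k \mathcal{F}/k$. The interpretation of variables is given by
$[\![x]\!](\zeta, \varsigma) = \varsigma(x)$, where $\varsigma(x)$ is the element from $\mathcal{U}$ bound to $x \in \mathcal{X}$. Furthermore, the
interpretation of function terms is defined as $[\![f(\vec{u})]\!](\zeta, \varsigma) = [\![f]\!](\zeta, [\,])([\![\vec{u}]\!](\zeta, \varsigma))$. 
It is generalized to sequences $\vec{u}$ of terms in a point-wise manner by taking 
$[\![a]\!](\zeta, \varsigma) = a$ for all $a \in \mathcal{U}$, and $[\![(u_1, \ldots, u_k)]\!](\zeta, \varsigma) = ([\![u_1]\!](\zeta, \varsigma), \ldots, [\![u_k]\!](\zeta, \varsigma))$. 
The satisfaction relations for conditions $cond$, definitions $def$ and constrains 
$con$ are specified by: 

\[ (\varrho, \varsigma) \models cond, \quad (\varrho, \zeta, \varsigma) \models def \quad \text{and} \quad (\varrho, \zeta, \varsigma) \models con \]

The formal definition is given in Table 1; here $\varsigma[x \mapsto a]$ stands for the mapping 
that is as $\varsigma$ except that $x$ is mapped to $a$. 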



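Table 1. Semantics of LFP 

\[
\begin{array}{lcl}
(\varrho, \varsigma) \models R(\vec{x}) & \text{iff} & [\![\vec{x}]\!]([\,], \varsigma) \in \varrho(R) \\
(\varrho, \varsigma) \models \neg R(\vec{x}) & \text{iff} & [\![\vec{x}]\!]([\,], \varsigma) \notin \varrho(R) \\
(\varrho, \varsigma) \models cond_1 \wedge cond_2 & \text{iff} & (\varrho, \varsigma) \models cond_1 \text{ and } (\varrho, \varsigma) \models cond_2 \\
(\varrho, \varsigma) \models cond_1 \vee cond_2 & \text{iff} & (\varrho, \varsigma) \models cond_1 \text{ or } (\varrho, \varsigma) \models cond_2 \\
(\varrho, \varsigma) \models \exists x : cond & \text{iff} & (\varrho, \varsigma[x \mapsto a]) \models cond \text{ for some } a \in \mathcal{U} \\
(\varrho, \varsigma) \models \forall x : cond & \text{iff} & (\varrho, \varsigma[x \mapsto a]) \models cond \text{ for all } a \in \mathcal{U} \\
(\varrho, \varsigma) \models true & \text{iff} & \text{always} \\
(\varrho, \varsigma) \models false & \text{iff} & \text{never} \\[1ex]
(\varrho, \zeta, \varsigma) \models R(\vec{u}) & \text{iff} & [\![\vec{u}]\!](\zeta, \varsigma) \in \varrho(R) \\
(\varrho, \zeta, \varsigma) \models def_1 \wedge def_2 & \text{iff} & (\varrho, \zeta, \varsigma) \models def_1 \text{ and } (\varrho, \zeta, \varsigma) \models def_2 \\
(\varrho, \zeta, \varsigma) \models cond \Rightarrow R(\vec{u}) & \text{iff} & (\varrho, \zeta, \varsigma) \models R(\vec{u}) \text{ whenever } (\varrho, \varsigma) \models cond \\
(\varrho, \zeta, \varsigma) \models \forall x : def & \text{iff} & (\varrho, \zeta, \varsigma[x \mapsto a]) \models def \text{ for all } a \in \mathcal{U} \\[1ex]
(\varrho, \zeta, \varsigma) \models R(\vec{u}) & \text{iff} & [\![\vec{u}]\!](\zeta, \varsigma) \in \varrho(R) \\
(\varrho, \zeta, \varsigma) \models con_1 \wedge con_2 & \text{iff} & (\varrho, \zeta, \varsigma) \models con_1 \text{ and } (\varrho, \zeta, \varsigma) \models con_2 \\
(\varrho, \zeta, \varsigma) \models R(\vec{u}) \Rightarrow cond & \text{iff} & (\varrho, \varsigma) \models cond \text{ whenever } (\varrho, \zeta, \varsigma) \models R(\vec{u}) \\
(\varrho, \zeta, \varsigma) \models \forall x : con & \text{iff} & (\varrho, \zeta, \varsigma[x \mapsto a]) \models con \text{ for all } a \in \mathcal{U} \\[1ex]
(\varrho, \zeta, \varsigma) \models cl_1, \ldots, cl_s & \text{iff} & (\varrho, \zeta, \varsigma) \models cl_i \text{ for all } 1 \le i \le s
\end{array}
\]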
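
For a finite universe, the satisfaction relation for conditions in Table 1 can be read directly as a recursive procedure. The sketch below is a minimal illustration under assumptions of our own: conditions are encoded as nested tuples, `rho` maps relation names to sets of tuples, and `sigma` maps variable names to elements of the universe. The function name `holds` and the tuple encoding are hypothetical and do not describe the actual solver.

```python
def holds(cond, rho, sigma, universe):
    """Decide (rho, sigma) |= cond for a condition encoded as nested tuples."""
    tag = cond[0]
    if tag == "true":
        return True
    if tag == "false":
        return False
    if tag == "rel":                      # positive query R(x1, ..., xk)
        _, name, variables = cond
        return tuple(sigma[x] for x in variables) in rho[name]
    if tag == "not":                      # negative query not R(x1, ..., xk)
        _, name, variables = cond
        return tuple(sigma[x] for x in variables) not in rho[name]
    if tag == "and":
        return (holds(cond[1], rho, sigma, universe)
                and holds(cond[2], rho, sigma, universe))
    if tag == "or":
        return (holds(cond[1], rho, sigma, universe)
                or holds(cond[2], rho, sigma, universe))
    if tag == "exists":                   # sigma[x -> a] for some a
        _, x, body = cond
        return any(holds(body, rho, {**sigma, x: a}, universe) for a in universe)
    if tag == "forall":                   # sigma[x -> a] for all a
        _, x, body = cond
        return all(holds(body, rho, {**sigma, x: a}, universe) for a in universe)
    raise ValueError(f"unknown condition: {tag}")


# Illustrative data: a binary relation E over the universe {0, 1, 2}.
universe = [0, 1, 2]
rho = {"E": {(0, 1), (1, 2)}}
has_successor = ("exists", "y", ("rel", "E", ("x", "y")))
```

With this data, `holds(has_successor, rho, {"x": 0}, universe)` is true, whereas binding `x` to 2 makes the condition false because 2 has no outgoing pair in `E`.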



3 Optimal Solutions 

Moore Family. First we establish a Moore family result for LFP, which guaran- 
tees that there always is a unique best solution for LFP formulae. 

Definition 3. A Moore family is a subset $Y$ of a complete lattice $L = (L, \sqsubseteq)$ 
that is closed under greatest lower bounds: $\forall Y' \subseteq Y : \bigsqcap Y' \in Y$. 

It follows that a Moore family always contains a least element, $\bigsqcap Y$, and a 
greatest element, $\bigsqcap \emptyset$, which equals the greatest element, $\top$, from $L$; in particular, 
a Moore family is never empty. The property is also called the model intersection 
property, since whenever we take a meet of a number of models we still get a 
model. 

Let $\Delta = \{\varrho \mid \varrho : \prod_k \mathcal{R}/k \rightarrow \mathcal{P}(\mathcal{U}^k)\}$ denote the set of interpretations $\varrho$ of 
predicate symbols in $\mathcal{R}$ over $\mathcal{U}$. We define a lexicographical ordering $\sqsubseteq$ 
by $\varrho_1 \sqsubseteq \varrho_2$ if and only if there is some $0 \le j \le s$, where $s$ is the order of the 
formula, such that the following properties hold: 

(a) $\varrho_1(R) = \varrho_2(R)$ for all $R \in \mathcal{R}$ with $rank(R) < j$, 

(b) $\varrho_1(R) \subseteq \varrho_2(R)$ for all $R \in \mathcal{R}$ with $rank(R) = j$ and either $j = 0$ or $R$ is a 
defined relation, 

(c) $\varrho_1(R) \supseteq \varrho_2(R)$ for all $R \in \mathcal{R}$ with $rank(R) = j$ and $R$ is a constrained 
relation, 

(d) either $j = s$ or $\varrho_1(R) \neq \varrho_2(R)$ for some relation $R \in \mathcal{R}$ with $rank(R) = j$. 



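Lemma 1. $\sqsubseteq$ defines a partial order. 

Proof. See Appendix A. □ 

Lemma 2. $(\Delta, \sqsubseteq)$ is a complete lattice with the greatest lower bound given by 

\[
(\bigsqcap M)(R) =
\begin{cases}
\bigcap \{\varrho(R) \mid \varrho \in M_j\} & \text{if } rank(R) = j \text{ and either } j = 0 \text{ or } R \text{ is defined in } cl_j \\
\bigcup \{\varrho(R) \mid \varrho \in M_j\} & \text{if } rank(R) = j \text{ and } R \text{ is constrained in } cl_j
\end{cases}
\]

where 

\[ M_j = \{\varrho \in M \mid \forall R' : rank(R') < j \Rightarrow (\bigsqcap M)(R') = \varrho(R')\} \]

Proof. See Appendix B. □ 

Note that $\bigsqcap M$ is well defined by induction on $j$ observing that $M_0 = M$ 
and $M_j \subseteq M_{j-1}$. 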

Proposition 1. Assume $cls$ is a stratified LFP formula, $\varsigma_0$ and $\zeta_0$ are
interpretations of the free variables and function symbols in $cls$, respectively.
Furthermore, $\varrho_0$ is an interpretation of all relations of rank 0. Then
$\{\varrho \mid (\varrho, \zeta_0, \varsigma_0) \models cls \wedge \forall R : rank(R) = 0 \Rightarrow \varrho(R) \supseteq \varrho_0(R)\}$ is a Moore family. 

Proof. See Appendix C. □ 

The result ensures that the approach falls within the framework of Abstract 
Interpretation [8,9]; hence we can be sure that there always is a single best 
solution for the analysis problem under consideration, namely the one defined 
in Proposition 1. 



Complexity. The least model for LFP formulae guaranteed by Proposition 1 can 
be computed efficiently as summarized in the following result. 

Proposition 2. For a finite universe $\mathcal{U}$, the best solution $\varrho$ such that $\varrho_0 \sqsubseteq \varrho$ of 
a LFP formula $cl_1, \ldots, cl_s$ (w.r.t. an interpretation of the constant symbols) can 
be computed in time 

\[ O\Big(|\varrho_0| + \sum_{1 \le i \le s} |cl_i| \, |\mathcal{U}|^{k_i}\Big) \]

where $k_i$ is the maximal nesting depth of quantifiers in the $cl_i$ and $|\varrho_0|$ is the 
sum of cardinalities of predicates $\varrho_0(R)$ of rank 0. We also assume unit time 
hash table operations (as in [19]). 

Proof. See Appendix D. □ 



For define clauses a straightforward method that achieves the above complex- 
ity proceeds by instantiating all variables occurring in the input formula in all 
possible ways. The resulting formula has no free variables; thus it can be solved 
by classical solvers for alternation-free Boolean equation systems [10] in linear 
time. 

In case of constrain clauses we first dualize the problem by transforming the 
co-inductive specification into the inductive one. The transformation increases 
the size of the input formula by a constant factor. Thereafter, we proceed in the 
same way as for the define clauses. 

In addition we need to take into account the number of known facts, which 
equals the sum of the cardinalities of all predicates of rank 0. As a result we get the 
complexity from Proposition 2. 
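
To make the instantiation idea concrete, the following Python sketch computes the least model of a single define layer over a finite universe. It is a deliberately naive illustration under our own assumptions, not the method behind Proposition 2: it re-instantiates all variables in every round until nothing changes, rather than solving a Boolean equation system in linear time. The clause encoding (a list of variables, a body evaluated as a Python function, and a head) and the name `least_model` are hypothetical, and bodies are assumed to query the relations of the current layer only positively.

```python
from itertools import product


def least_model(universe, clauses, rho0):
    """Least relations satisfying clauses of the form  forall vars: body => R(args).

    Each clause is a triple (variables, body, (relation, args)); body(rho, env)
    must be monotone in the relations asserted in this layer.
    """
    rho = {name: set(tuples) for name, tuples in rho0.items()}
    changed = True
    while changed:
        changed = False
        for variables, body, (relation, args) in clauses:
            # Instantiate the quantified variables in all possible ways.
            for values in product(universe, repeat=len(variables)):
                env = dict(zip(variables, values))
                if body(rho, env):
                    fact = tuple(env[a] for a in args)
                    if fact not in rho[relation]:
                        rho[relation].add(fact)
                        changed = True
    return rho


# Illustrative layer: T is the transitive closure of a rank-0 relation E.
closure_clauses = [
    (["x", "y"],
     lambda rho, e: (e["x"], e["y"]) in rho["E"],
     ("T", ("x", "y"))),
    (["x", "y", "z"],
     lambda rho, e: (e["x"], e["y"]) in rho["E"] and (e["y"], e["z"]) in rho["T"],
     ("T", ("x", "z"))),
]
closure = least_model([0, 1, 2], closure_clauses,
                      {"E": {(0, 1), (1, 2)}, "T": set()})
```

Here the second clause has quantifier nesting depth 3, so each round inspects $|\mathcal{U}|^3$ instantiations, which matches the $|cl_i||\mathcal{U}|^{k_i}$ term of Proposition 2 per round but not its overall bound.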

The solver. We developed a state-of-the-art solver for LFP, which is implemented 
in continuation passing style using Haskell. The solver computes the least model 
guaranteed by Proposition 1 and has a worst case time complexity as given by 
Proposition 2. For many clauses it exhibits a running time substantially lower 
than the worst case time complexity. Indeed, [19] gives a formula estimating the 
less than worst case time complexity on a given clause. 

The solver deals with stratification by computing the relations in increasing 
order on their rank and therefore the negations present no obstacles. The re- 
lations are represented as Ordered Binary Decision Diagrams (OBDDs), which 
were originally used in hardware verification. OBDDs can efficiently store a large 
number of states that share many commonalities [4,3], and have already been 
used in a number of program analyses proving to be very efficient. The algorithm 
is an extension of the symbolic algorithm presented in [TT] and is based on the 
top-down solving approach of Le Charlier and van Hentenryck [B] . 

The solver automatically translates LFP formulae into highly efficient OBDD 
implementations. Since the OBDDs represent sets of tuples, the solver operates 
on entire relations at a time, rather than individual tuples. The cost of the OBDD 
operations depends on the size of the OBDD and not the number of tuples in 
the relation; hence dense relations can be computed efficiently as long as their 
encoded representations are compact. 

4 Application to Data Flow Analysis 

Datalog has already been used for program analysis in compilers [25,22,23]. In 
this section we present how the LFP logic can be used to specify analyses that 
are instances of Bit-Vector Frameworks, which are a special case of the Monotone 
Frameworks [20,14]. 

A Monotone Framework consists of (a) a property space that usually is a 
complete lattice $L$ satisfying the Ascending Chain Condition, and (b) transfer 
functions, i.e. monotone functions from $L$ to $L$. The property space is used 
to represent the data flow information, whereas transfer functions capture the 
behavior of actions. In the Bit-Vector Framework, the property space is a power 
set of some finite set and all transfer functions are of the form
$f_n(x) = (x \setminus kill_n) \cup gen_n$. 

Throughout the section we assume that a program is represented as a control 
flow graph [15,20], which is a directed graph with one entry node (having no 
incoming edges) and one exit node (having no outgoing edges), called extremal 
nodes. The remaining nodes represent statements and have transfer functions 
associated with them. 

Backward may analyses. Let us first consider backward may analyses expressed 
as an instance of the Monotone Frameworks. In the analyses, we require the least 
sets that solve the equations and we are able to detect properties satisfied by at 
least one path leading to the given node. The analyses use the reversed edges in 
the flow graph; hence the data flow information is propagated against the flow 
of the program starting at the exit node. The data flow equations are defined as 
follows 

\[
A(n) =
\begin{cases}
\iota & \text{if } n = n_{exit} \\
\bigcup \{ f_n(A(n')) \mid (n, n') \in E \} & \text{otherwise}
\end{cases}
\]

where $A(n)$ represents data flow information at the entry to the node $n$, $E$ is a 
set of edges in the control flow graph, and $\iota$ is the initial analysis information. 
The first case in the above equation initializes the exit node with the initial 
analysis information, whereas the second one joins the data flow information 
from different paths (using the reversed flow). We use $\bigcup$ since we want to be able 
to detect properties satisfied by at least one path leading to the given node. 

The LFP specification for backward may analyses consists of two conjuncts 
corresponding to two cases in the data flow equations. Since in case of may 
analyses we aim at computing the least solution, the specification is defined in 
terms of a define clause. The formula is obtained as 

\[
define\Big(
(\forall x : \iota(x) \Rightarrow A(n_{exit}, x)) \;\wedge
\bigwedge_{(s,t) \in E} \forall x : (A(t, x) \wedge \neg kill_s(x)) \vee gen_s(x) \Rightarrow A(s, x)
\Big)
\]

The first conjunct initializes the exit node with initial analysis information, 
denoted by the predicate $\iota$. The second one propagates data flow information 
against the edges in the control flow graph, i.e. whenever we have an edge $(s, t)$ 
in the control flow graph, we propagate data flow information from $t$ to $s$, by 
applying the corresponding transfer function. 

Notice that there is no explicit formula for joining analysis information from 
different paths, as is the case in the data flow equations; rather, it is done 
implicitly. Suppose there are two distinct edges $(s, p)$ and $(s, q)$ in the flow graph, 
then we get 

\[
\begin{array}{l}
\forall x : \underbrace{(A(p, x) \wedge \neg kill_s(x)) \vee gen_s(x)}_{cond_p(x)} \Rightarrow A(s, x) \\
\forall x : \underbrace{(A(q, x) \wedge \neg kill_s(x)) \vee gen_s(x)}_{cond_q(x)} \Rightarrow A(s, x)
\end{array}
\]

which is equivalent to 

\[ \forall x : cond_p(x) \vee cond_q(x) \Rightarrow A(s, x) \]
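
The define clause for backward may analyses can be mirrored by a small least fixed point computation. The Python sketch below is illustrative only: it uses a plain round-robin iteration rather than the LFP solver, and the control flow graph, the `kill` and `gen` sets and the function name `backward_may` are assumptions made for this example (a live-variables style instance in which `y` is live at the exit).

```python
def backward_may(nodes, edges, exit_node, iota, kill, gen):
    """Least A with iota <= A[exit] and (A[t] - kill[s]) | gen[s] <= A[s]
    for every edge (s, t)."""
    A = {n: set() for n in nodes}
    A[exit_node] |= iota                      # first conjunct
    changed = True
    while changed:
        changed = False
        for s, t in edges:                    # second conjunct, one edge at a time
            new = (A[t] - kill[s]) | gen[s]
            if not new <= A[s]:
                A[s] |= new                   # joining over paths happens implicitly
                changed = True
    return A


# Hypothetical program: node 1 tests c and branches to node 2 (y := x)
# or node 3 (y := z); both continue to the exit node.
nodes = ["entry", 1, 2, 3, "exit"]
edges = [("entry", 1), (1, 2), (1, 3), (2, "exit"), (3, "exit")]
kill = {"entry": set(), 1: set(), 2: {"y"}, 3: {"y"}, "exit": set()}
gen = {"entry": set(), 1: {"c"}, 2: {"x"}, 3: {"z"}, "exit": set()}
live = backward_may(nodes, edges, "exit", {"y"}, kill, gen)
```

Node 1 has two outgoing edges, so its result collects the contributions of both successors, which is exactly the implicit disjunction $cond_p(x) \vee cond_q(x)$ discussed above.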



Forward must analyses. Let us now consider the general pattern for defining 
forward must analyses. Here we require the largest sets that solve the equations 
and we are able to detect properties satisfied by all paths leading to a given 
node. The analyses propagate the data flow information along the edges of the 
flow graph starting at the entry node. The data flow equations are defined as 
follows 

\[
A(n) =
\begin{cases}
\iota & \text{if } n = n_{entry} \\
\bigcap \{ f_n(A(n')) \mid (n', n) \in E \} & \text{otherwise}
\end{cases}
\]

where $A(n)$ represents analysis information at the exit from the node $n$. Since 
we require the greatest solution, the greatest lower bound $\bigcap$ is used to combine 
information from different paths. 

The corresponding LFP specification is obtained as follows 

\[
constrain\Big(
(\forall x : A(n_{entry}, x) \Rightarrow \iota(x)) \;\wedge
\bigwedge_{(s,t) \in E} \forall x : A(t, x) \Rightarrow (A(s, x) \wedge \neg kill_t(x)) \vee gen_t(x)
\Big)
\]

Since we aim at computing the greatest solution, the analysis is given by means 
of a constrain clause. The first conjunct initializes the entry node with the initial 
analysis information, whereas the second one propagates the information along 
the edges in the control flow graph, i.e. whenever we have an edge $(s, t)$ in the 
control flow graph, we propagate data flow information from $s$ to $t$, by applying 
the corresponding transfer function. 

The general patterns for defining forward may and backward must analyses 
follow a similar pattern. In case of forward may analyses the data flow information 
is propagated along the edges of the flow graph and since we aim at computing 
the least solution, the analyses are given by means of define clauses. Backward 
must analyses, on the other hand, use reversed edges in the flow graph and are 
specified using constrain clauses. 

In order to compute the least solution of the data flow equations, one can 
use a general iterative algorithm for Monotone Frameworks. The worst case 
complexity of the algorithm is $O(|E|h)$, where $|E|$ is the number of edges in the 
control flow graph, and $h$ is the height of the underlying lattice [20]. For
Bit-Vector Frameworks the lattice is a powerset of a finite set $\mathcal{U}$; hence $h$ is $O(|\mathcal{U}|)$. 
This gives the complexity $O(|E||\mathcal{U}|)$. 

According to Proposition 2 the worst case time complexity of the LFP
specification is $O(|\varrho_0| + \sum_{1 \le i \le |E|} |cl_i||\mathcal{U}|)$. Since the size of the clause $cl_i$ is constant and 
the sum of cardinalities of predicates of rank 0 is $O(|N|)$ we get $O(|N| + |E||\mathcal{U}|)$. 
Provided that $|E| > |N|$ we achieve $O(|E||\mathcal{U}|)$, i.e. the same worst case complexity 
as the standard iterative algorithm. 

It is common in compiler optimization that various analyses are
performed at the same time. Since LFP logic has direct support for both least fixed 
points and greatest fixed points, we can perform both may and must analyses 
at the same time by splitting the analyses into separate layers. 
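
The constrain clause for forward must analyses can be read as a greatest fixed point computation: start from the full universe at every node and remove whatever violates a conjunct. The Python sketch below illustrates this reading only; it is not the dualization used by the solver, and the graph, the `kill` and `gen` sets and the name `forward_must` are hypothetical (an available-expressions style instance).

```python
def forward_must(nodes, edges, entry_node, iota, kill, gen, universe):
    """Greatest A with A[entry] <= iota and A[t] <= (A[s] - kill[t]) | gen[t]
    for every edge (s, t)."""
    A = {n: set(universe) for n in nodes}     # start from the top element
    A[entry_node] &= iota                     # first conjunct
    changed = True
    while changed:
        changed = False
        for s, t in edges:                    # second conjunct, one edge at a time
            allowed = (A[s] - kill[t]) | gen[t]
            if not A[t] <= allowed:
                A[t] &= allowed               # intersection over incoming paths
                changed = True
    return A


# Hypothetical program: node 1 computes a+b, then branches to node 2
# (which also computes a*b) or node 3; both branches meet in node 4.
ae_universe = {"a+b", "a*b"}
ae_nodes = ["entry", 1, 2, 3, 4]
ae_edges = [("entry", 1), (1, 2), (1, 3), (2, 4), (3, 4)]
ae_kill = {n: set() for n in ae_nodes}
ae_gen = {"entry": set(), 1: {"a+b"}, 2: {"a*b"}, 3: set(), 4: set()}
available = forward_must(ae_nodes, ae_edges, "entry", set(),
                         ae_kill, ae_gen, ae_universe)
```

At the merge node 4 only `a+b` survives, since `a*b` is available on one incoming path but not on the other.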



5 Application to Constraint Satisfaction 

Arc consistency is a basic technique for solving Constraint Satisfaction Problems 
(CSP) and has various applications within e.g. Artificial Intelligence. Formally 
a CSP [18,26] problem can be defined as follows. 

Definition 4. A Constraint Satisfaction Problem $(N, D, C)$ consists of a finite 
set of variables $N = \{x_1, \ldots, x_n\}$, a set of domains $D = \{D_1, \ldots, D_n\}$, where 
$x_i$ ranges over $D_i$, and a set of constraints $C \subseteq \{c_{ij} \mid x_i, x_j \in N\}$, where each 
constraint $c_{ij}$ is a binary relation between variables $x_i$ and $x_j$. 

For simplicity we consider binary constraints only. Furthermore, we can rep- 
resent a CSP problem as a directed graph in the following way. 

Definition 5. A constraint graph of a CSP problem $(N, D, C)$ is a directed 
graph $G = (V, E)$ where $V = N$ and $E = \{(x_i, x_j) \mid c_{ij} \in C\}$. 

Thus vertices of the graph correspond to the variables and an edge in the 
graph between nodes $x_i$ and $x_j$ corresponds to the constraint $c_{ij} \in C$. 

The arc consistency problem is formally stated in the following definition. 

Definition 6. Given a CSP $(N, D, C)$, an arc $(x_i, x_j)$ of its constraint graph 
is arc consistent if and only if $\forall x \in D_i$, there exists $y \in D_j$ such that $c_{ij}(x, y)$ 
holds, as well as $\forall y \in D_j$, there exists $x \in D_i$ such that $c_{ij}(x, y)$ holds. A CSP 
$(N, D, C)$ is arc consistent if and only if each arc in its constraint graph is arc 
consistent. 

The basic and widely used arc consistency algorithm is the AC-3 algorithm 
proposed in 1977 by Mackworth [18]. The complexity of the algorithm is $O(ed^3)$, 
where $e$ is the number of constraints and $d$ the size of the largest domain. The 
algorithm is used in many constraint solvers due to its simplicity and fairly good 
efficiency [24]. 

Now we show the LFP specification of the arc consistency problem. A domain 
of a variable $x_i$ is represented as a unary relation $D_i$, and for each constraint 
$c_{ij} \in C$ we have a binary relation $C_{ij} \subseteq D_i \times D_j$. Then we obtain 

\[
constrain\Big( \bigwedge_{c_{ij} \in C}
(\forall x : D_i(x) \Rightarrow \exists y : D_j(y) \wedge C_{ij}(x, y)) \wedge
(\forall y : D_j(y) \Rightarrow \exists x : D_i(x) \wedge C_{ij}(x, y))
\Big)
\]

which exactly captures the conditions from Definition 6. 

According to Proposition 2 the above specification gives rise to the worst 
case complexity $O(ed^2)$. The original AC-3 algorithm was optimized in [26] where 
it was shown that it achieves the worst case optimal time complexity of $O(ed^2)$. 
Hence the LFP specification is as efficient as the improved version of the AC-3 
algorithm. 
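
The constrain clause above asks for the largest domains in which every value has a support on each arc. The following Python sketch illustrates that reading by repeatedly discarding unsupported values until nothing changes. It is a naive fixed point iteration written for exposition, neither AC-3 nor the LFP solver, and the function name `enforce_arc_consistency`, the variable names and the small scheduling-style instance are assumptions of this example.

```python
def enforce_arc_consistency(domains, constraints):
    """Largest sub-domains such that every arc (i, j) is arc consistent.

    `constraints` maps a pair of variable names (i, j) to a predicate c(x, y).
    """
    doms = {v: set(values) for v, values in domains.items()}
    changed = True
    while changed:
        changed = False
        for (i, j), c in constraints.items():
            # forall x : D_i(x) => exists y : D_j(y) and C_ij(x, y)
            keep_i = {x for x in doms[i] if any(c(x, y) for y in doms[j])}
            # forall y : D_j(y) => exists x : D_i(x) and C_ij(x, y)
            keep_j = {y for y in doms[j] if any(c(x, y) for x in keep_i)}
            if keep_i != doms[i] or keep_j != doms[j]:
                doms[i], doms[j] = keep_i, keep_j
                changed = True
    return doms


# Illustrative instance: two start times with 3 <= s2 - s1 <= 4.
start_domains = {"s1": range(0, 5), "s2": range(0, 7)}
start_constraints = {("s1", "s2"): lambda x, y: 3 <= y - x <= 4}
pruned = enforce_arc_consistency(start_domains, start_constraints)
```

For this instance the value 4 is removed from the first domain and the values 0, 1 and 2 from the second, since none of them has a supporting partner.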

Example 2. As an example let us consider the following problem. Assume we 
have two processes $P_1$ and $P_2$ that need to be finished before 8 time units have 
elapsed. The process $P_1$ is required to run for 3 or 4 time units, the process $P_2$ 
is required to run for precisely 2 time units, and $P_2$ should start at the exact 
moment when $P_1$ finishes. 

The problem can be defined as an instance of CSP $(N, D, C)$ where
$N = \{s_1, s_2\}$ denotes the starting times of the corresponding processes. Since both 
processes need to be completed before 8 time units have elapsed we have
$D_1 = D_2 = \{0, \ldots, 8\}$. Moreover, we have the following constraints
$C = \{c_{12} = (3 \le s_2 - s_1 \le 4),\ c_{11} = (0 \le s_1 \le 4),\ c_{22} = (0 \le s_2 \le 6)\}$. We can represent the 
above CSP problem as a constraint graph depicted in Figure 1. 

Fig. 1. Arc consistency. 

Furthermore it can be specified as the following LFP formulae 

\[
\begin{array}{l}
define\Big( \bigwedge_{0 \le x \le 4} D_1(x) \wedge \bigwedge_{0 \le y \le 6} D_2(y) \wedge \bigwedge_{3 \le z \le 4} C_{12}(z) \Big), \\[1ex]
constrain\Big(
(\forall x : D_1(x) \Rightarrow \exists y : D_2(y) \wedge C_{12}(y - x)) \wedge
(\forall y : D_2(y) \Rightarrow \exists x : D_1(x) \wedge C_{12}(y - x))
\Big)
\end{array}
\]

where we write $y - x$ for a function $f_{-}(y, x)$. 

6 Application to Model Checking 

This section is concerned with the application of the LFP logic to the model 
checking problem [5]. In particular we show how LFP can be used to specify a 
prototype model checker for a special purpose modal logic of interest. Here we 
illustrate the approach on the familiar case of Computation Tree Logic (CTL) 
[7]. Throughout this section, we assume that $TS$ is finite and has no terminal 
states. 

CTL distinguishes between state formulae and path formulae. CTL state 
formulae over the set $AP$ of atomic propositions are formed according to the 
following grammar 

\[ \Phi ::= true \mid a \mid \Phi_1 \wedge \Phi_2 \mid \neg\Phi \mid \mathbf{E}\varphi \mid \mathbf{A}\varphi \]

where $a \in AP$ and $\varphi$ is a path formula. CTL path formulae are formed according 
to the following grammar 

\[ \varphi ::= \mathbf{X}\Phi \mid \Phi_1 \mathbf{U} \Phi_2 \mid \mathbf{G}\Phi \]

where $\Phi$, $\Phi_1$ and $\Phi_2$ are state formulae. The satisfaction relation $\models$ is defined 
for state formulae by 

\[
\begin{array}{lcl}
s \models true & \text{iff} & true \\
s \models a & \text{iff} & a \in L(s) \\
s \models \neg\Phi & \text{iff} & \text{not } s \models \Phi \\
s \models \Phi_1 \wedge \Phi_2 & \text{iff} & s \models \Phi_1 \text{ and } s \models \Phi_2 \\
s \models \mathbf{E}\varphi & \text{iff} & \pi \models \varphi \text{ for some } \pi \in Paths(s) \\
s \models \mathbf{A}\varphi & \text{iff} & \pi \models \varphi \text{ for all } \pi \in Paths(s)
\end{array}
\]

where $Paths(s)$ denotes the set of maximal path fragments $\pi$ starting in $s$. The 
satisfaction relation $\models$ for path formulae is defined by 

\[
\begin{array}{lcl}
\pi \models \mathbf{X}\Phi & \text{iff} & \pi[1] \models \Phi \\
\pi \models \Phi_1 \mathbf{U} \Phi_2 & \text{iff} & \exists j \ge 0 : (\pi[j] \models \Phi_2 \wedge (\forall\, 0 \le k < j : \pi[k] \models \Phi_1)) \\
\pi \models \mathbf{G}\Phi & \text{iff} & \forall j \ge 0 : \pi[j] \models \Phi
\end{array}
\]

where for path $\pi = s_0 s_1 \ldots$ and an integer $i \ge 0$, $\pi[i]$ denotes the $(i + 1)$th state 
of $\pi$, i.e. $\pi[i] = s_i$. 

The CTL model checking amounts to a recursive computation of the set 
$Sat(\Phi)$ of all states satisfying $\Phi$, which is sometimes referred to as global model 
checking. The algorithm boils down to a bottom-up traversal of the abstract 
syntax tree of the CTL formula $\Phi$. The nodes of the abstract syntax tree
correspond to the sub-formulae of $\Phi$, and leaves are either a constant $true$ or an 
atomic proposition $a \in AP$. 



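Table 2. LFP specification of satisfaction sets 

\[
\begin{array}{l}
define(\forall s : Sat_{true}(s)) \\[0.5ex]
define(\forall s : L_a(s) \Rightarrow Sat_a(s)) \\[0.5ex]
define(\forall s : Sat_{\Phi_1}(s) \wedge Sat_{\Phi_2}(s) \Rightarrow Sat_{\Phi_1 \wedge \Phi_2}(s)) \\[0.5ex]
define(\forall s : \neg Sat_{\Phi}(s) \Rightarrow Sat_{\neg\Phi}(s)) \\[0.5ex]
define(\forall s : (\exists s' : T(s, s') \wedge Sat_{\Phi}(s')) \Rightarrow Sat_{\mathbf{EX}\Phi}(s)) \\[0.5ex]
define(\forall s : (\forall s' : \neg T(s, s') \vee Sat_{\Phi}(s')) \Rightarrow Sat_{\mathbf{AX}\Phi}(s)) \\[0.5ex]
define\left(\begin{array}{l}
(\forall s : Sat_{\Phi_2}(s) \Rightarrow Sat_{\mathbf{E}[\Phi_1 \mathbf{U} \Phi_2]}(s)) \wedge \\
(\forall s : Sat_{\Phi_1}(s) \wedge (\exists s' : T(s, s') \wedge Sat_{\mathbf{E}[\Phi_1 \mathbf{U} \Phi_2]}(s')) \Rightarrow Sat_{\mathbf{E}[\Phi_1 \mathbf{U} \Phi_2]}(s))
\end{array}\right) \\[2ex]
define\left(\begin{array}{l}
(\forall s : Sat_{\Phi_2}(s) \Rightarrow Sat_{\mathbf{A}[\Phi_1 \mathbf{U} \Phi_2]}(s)) \wedge \\
(\forall s : Sat_{\Phi_1}(s) \wedge (\forall s' : \neg T(s, s') \vee Sat_{\mathbf{A}[\Phi_1 \mathbf{U} \Phi_2]}(s')) \Rightarrow Sat_{\mathbf{A}[\Phi_1 \mathbf{U} \Phi_2]}(s))
\end{array}\right) \\[2ex]
constrain\left(\begin{array}{l}
(\forall s : Sat_{\mathbf{EG}\Phi}(s) \Rightarrow Sat_{\Phi}(s)) \wedge \\
(\forall s : Sat_{\mathbf{EG}\Phi}(s) \Rightarrow (\exists s' : T(s, s') \wedge Sat_{\mathbf{EG}\Phi}(s')))
\end{array}\right) \\[2ex]
constrain\left(\begin{array}{l}
(\forall s : Sat_{\mathbf{AG}\Phi}(s) \Rightarrow Sat_{\Phi}(s)) \wedge \\
(\forall s : Sat_{\mathbf{AG}\Phi}(s) \Rightarrow (\forall s' : \neg T(s, s') \vee Sat_{\mathbf{AG}\Phi}(s')))
\end{array}\right)
\end{array}
\]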



Now let us consider the LFP specification, where for each formula $\Phi$ we 
define a relation $Sat_{\Phi} \subseteq S$ characterizing states where $\Phi$ holds. The specification 
is defined in Table 2. The clause for $true$ is straightforward and says that $true$ 
holds in all states. The clause for an atomic proposition $a$ expresses that a state 
satisfies $a$ whenever it is in $L_a$, where we assume that we have a predicate $L_a \subseteq S$ 
for each $a \in AP$. The clause for $\Phi_1 \wedge \Phi_2$ captures that a state satisfies $\Phi_1 \wedge \Phi_2$ 
whenever it satisfies both $\Phi_1$ and $\Phi_2$. Similarly a state satisfies $\neg\Phi$ if it does not 
satisfy $\Phi$. The formula for $\mathbf{EX}\Phi$ captures that a state $s$ satisfies $\mathbf{EX}\Phi$, if there is 
a transition to state $s'$ such that $s'$ satisfies $\Phi$. The formula for $\mathbf{AX}\Phi$ expresses 
that a state $s$ satisfies $\mathbf{AX}\Phi$ if for all states $s'$: either there is no transition 
from $s$ to $s'$, or otherwise $s'$ satisfies $\Phi$. The formula for $\mathbf{E}[\Phi_1 \mathbf{U} \Phi_2]$ captures two 
possibilities. If a state satisfies $\Phi_2$ then it also satisfies $\mathbf{E}[\Phi_1 \mathbf{U} \Phi_2]$. Alternatively 
if the state $s$ satisfies $\Phi_1$ and there is a transition to a state satisfying $\mathbf{E}[\Phi_1 \mathbf{U} \Phi_2]$ 
then $s$ also satisfies $\mathbf{E}[\Phi_1 \mathbf{U} \Phi_2]$. The formula $\mathbf{A}[\Phi_1 \mathbf{U} \Phi_2]$ also captures two cases. If 
a state satisfies $\Phi_2$ then it also satisfies $\mathbf{A}[\Phi_1 \mathbf{U} \Phi_2]$. Alternatively state $s$ satisfies 
$\mathbf{A}[\Phi_1 \mathbf{U} \Phi_2]$ if it satisfies $\Phi_1$ and for all states $s'$ either there is no transition from 
$s$ to $s'$ or $\mathbf{A}[\Phi_1 \mathbf{U} \Phi_2]$ is valid in $s'$. Let us now consider the formula for $\mathbf{EG}\Phi$. 
Since the set of states satisfying $\mathbf{EG}\Phi$ is defined as the largest set satisfying the 
semantics of $\mathbf{EG}\Phi$, the property is defined by means of a constrain clause. The first 
conjunct expresses that whenever a state satisfies $\mathbf{EG}\Phi$ it also satisfies $\Phi$. The 
second conjunct says that if a state satisfies $\mathbf{EG}\Phi$ then there exists a transition 
to a state $s'$ such that $s'$ satisfies $\mathbf{EG}\Phi$. Finally let us consider the formula 
for $\mathbf{AG}\Phi$, which is also defined in terms of a constrain clause and distinguishes 
between two cases. In the first one whenever a state satisfies $\mathbf{AG}\Phi$, it also satisfies 
$\Phi$. Alternatively, if a state $s$ satisfies $\mathbf{AG}\Phi$ then for all states $s'$: either there is 
no transition from $s$ to $s'$ or otherwise $s'$ satisfies $\mathbf{AG}\Phi$. 

The generation of clauses for $Sat_{\Phi}$ is performed in the postorder traversal 
over $\Phi$; hence the clauses defining sub-formulas of $\Phi$ are defined in the lower 
layers. It is important to note that the specification in Table 2 is both correct 
and precise. It follows that an implementation of the given specification of CTL 
by means of the LFP solver constitutes a model checker for CTL. 

We may estimate the worst case time complexity of model checking performed 
using LFP. Consider a CTL formula $\Phi$ of size $|\Phi|$; it is immediate that the 
LFP clause has size $O(|\Phi|)$, and the nesting depth is at most 2. According to 
Proposition 2 the worst case time complexity of the LFP specification is
$O(|S| + |S|^2|\Phi|)$, where $|S|$ is the number of states in the transition system. Using a more 
refined reasoning than that of Proposition 2 we obtain $O(|S| + |T||\Phi|)$, where 
$|T|$ is the number of transitions in the transition system. It is due to the fact 
that the "double quantifications" over states in Table 2 really correspond to 
traversing all possible transitions rather than all pairs of states. Thus our LFP 
model checking algorithm has the same worst case complexity as classical model 
checking algorithms. 
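
The clauses of Table 2 can be illustrated by computing satisfaction sets directly over an explicit transition relation. The Python sketch below covers three representative cases: $\mathbf{EX}$ (a single step), $\mathbf{E}[\Phi_1 \mathbf{U} \Phi_2]$ (a least fixed point, as in a define clause) and $\mathbf{EG}$ (a greatest fixed point, as in a constrain clause). It is a naive set-based illustration, not the OBDD-based solver and without the stated complexity guarantees; the function names and the three-state transition system are assumptions of this example.

```python
def sat_ex(states, trans, sat_phi):
    """States with some transition into sat_phi."""
    return {s for s in states if any((s, t) in trans for t in sat_phi)}


def sat_eu(states, trans, sat_phi1, sat_phi2):
    """Least set containing sat_phi2 and every sat_phi1 state with a successor in it."""
    result = set(sat_phi2)
    changed = True
    while changed:
        changed = False
        for s in sat_phi1 - result:
            if any((s, t) in trans for t in result):
                result.add(s)
                changed = True
    return result


def sat_eg(states, trans, sat_phi):
    """Greatest subset of sat_phi in which every state has a successor in the set."""
    result = set(sat_phi)
    changed = True
    while changed:
        changed = False
        for s in list(result):
            if not any((s, t) in trans for t in result):
                result.discard(s)
                changed = True
    return result


# Hypothetical transition system without terminal states:
# 0 <-> 1, 1 -> 2, 2 -> 2; p holds in states 0 and 1, q holds in state 2.
ts_states = {0, 1, 2}
ts_trans = {(0, 1), (1, 0), (1, 2), (2, 2)}
sat_p, sat_q = {0, 1}, {2}
```

In this system `sat_ex` applied to `sat_q` yields the states 1 and 2, every state satisfies $\mathbf{E}[p\,\mathbf{U}\,q]$, and $\mathbf{EG}\,p$ holds in states 0 and 1 because of the cycle between them.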

Example 3. As an example let us consider the Bakery mutual exclusion
algorithm [17]. Although the algorithm is designed for an arbitrary number of
processes, we consider the simpler setting with two processes. Let $P_1$ and $P_2$ be the 
two processes, and $x_1$ and $x_2$ be two shared variables both initialized to 0. We 
can represent the algorithm as an interleaving of two program graphs [5], which 
are directed graphs where actions label the edges rather than the nodes. The 
algorithm is as follows 




The variables $x_1$ and $x_2$ are used to resolve the conflict when both processes 
want to enter the critical section. When $x_i$ is equal to zero, the process $P_i$ is not 
in the critical section and does not attempt to enter it; the other one can safely 
proceed to the critical section. Otherwise, if both shared variables are non-zero, 
the process with smaller "ticket" (i.e. value of the corresponding variable) can 
enter the critical section. This reasoning is captured by the conditions of
busy-waiting loops. When a process wants to enter the critical section, it simply takes 
the next "ticket", hence giving priority to the other process. 

From the algorithm above, we can obtain a program graph corresponding to 
the interleaving of the two processes, which is depicted in Figure 2. 




Fig. 2. Interleaved program graph. 



The CTL formulation of the mutual exclusion property is $\mathbf{AG}\neg(crit_1 \wedge crit_2)$, 
which states that along all paths globally it is never the case that $crit_1$ and $crit_2$ 
hold at the same time. 

As already mentioned, in order to specify the problem we proceed bottom 
up by specifying formulae for the sub problems. After a bit of simplification we 
obtain the following LFP clauses 

\[
\begin{array}{l}
define(\forall s : L_{crit_1}(s) \wedge L_{crit_2}(s) \Rightarrow Sat_{crit}(s)), \\[1ex]
constrain\left(\begin{array}{l}
(\forall s : Sat_{\mathbf{AG}(\neg crit)}(s) \Rightarrow \neg Sat_{crit}(s)) \wedge \\
(\forall s : Sat_{\mathbf{AG}(\neg crit)}(s) \Rightarrow (\forall s' : \neg T(s, s') \vee Sat_{\mathbf{AG}(\neg crit)}(s')))
\end{array}\right)
\end{array}
\]

where relation $L_{crit_1}$ (respectively $L_{crit_2}$) characterizes states in the interleaved 
program graph that correspond to process $P_1$ (respectively $P_2$) being in the 
critical section. Furthermore, the $\mathbf{AG}$ modality is defined by means of a constrain 
clause. The first conjunct expresses that whenever a state satisfies the mutual 
exclusion property $\mathbf{AG}(\neg crit)$ it does not satisfy $crit$. The second one states 
that if a state satisfies the mutual exclusion property then all successors do as 
well, i.e. for an arbitrary state, it is either not a successor or else satisfies the 
mutual exclusion property. 
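
The constrain clause for $\mathbf{AG}(\neg crit)$ can be illustrated as a greatest fixed point over an explicit set of states. The Python sketch below is an assumption-laden illustration: it does not encode the Bakery program graph of Figure 2 but a small hypothetical transition system in which state 3 plays the role of a state violating mutual exclusion, and the name `sat_ag_not` is chosen for exposition.

```python
def sat_ag_not(states, trans, sat_crit):
    """Greatest set of states avoiding sat_crit and closed under successors."""
    result = set(states) - set(sat_crit)          # first conjunct
    changed = True
    while changed:
        changed = False
        for s in list(result):
            # Second conjunct: every successor must satisfy the property too.
            if any((s, t) in trans and t not in result for t in states):
                result.discard(s)
                changed = True
    return result


# Hypothetical system: 0 <-> 1 never reach the bad state 3, but 2 -> 3.
mutex_states = {0, 1, 2, 3}
mutex_trans = {(0, 1), (1, 0), (2, 3), (3, 3)}
safe = sat_ag_not(mutex_states, mutex_trans, {3})
```

Mutual exclusion then holds for the system exactly when its initial state belongs to the computed set; here states 0 and 1 qualify, while state 2 is discarded because it can step into the violating state.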

7 Conclusions 

In the paper we introduced the Layered Fixed Point Logic, which is a suitable 
formalism for the specification of analysis problems. Its most prominent feature 
is the direct support for both inductive as well as co-inductive specifications of 
properties. 

We established a Moore Family result that guarantees that there always is 
a best solution for the LFP formulae. More generally this ensures that the
approach taken falls within the general Abstract Interpretation framework. The other 
theoretical contribution is the parametrized worst case time complexity result, 
which provides a simple characterization of the running time of the LFP
programs. 

We developed a state-of-the-art solving algorithm for LFP, which is a 
continuation passing style algorithm based on OBDD representations of relations. 
The solver achieves the best known theoretical complexity bounds, and for many 
clauses exhibits a running time substantially lower than the worst case time 
complexity. 

We showed that the logic and the associated solver can be used for rapid 
prototyping by presenting applications within Static Analysis, Constraint 
Satisfaction Problems and Model Checking. In all cases the complexity result specializes 
to the worst case time complexity of classical results. 

These appendices are not intended for publication and references to them 
will be removed in the final version. 

A Proof of Lemma 1 

Proof. Reflexivity: $\forall \varrho \in \Delta : \varrho \sqsubseteq \varrho$. 

To show that $\varrho \sqsubseteq \varrho$ let us take $j = s$. If $rank(R) < j$ then $\varrho(R) = \varrho(R)$ as 
required. Otherwise if $rank(R) = j$ and either $R$ is a defined relation or $j = 0$, 
then from $\varrho(R) = \varrho(R)$ we get $\varrho(R) \subseteq \varrho(R)$. The last case is when $rank(R) = j$ 
and $R$ is a constrained relation. Then from $\varrho(R) = \varrho(R)$ we get $\varrho(R) \supseteq \varrho(R)$. 
Thus we get the required $\varrho \sqsubseteq \varrho$. 

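Transitivity: $\forall \varrho_1, \varrho_2, \varrho_3 \in \Delta : \varrho_1 \sqsubseteq \varrho_2 \wedge \varrho_2 \sqsubseteq \varrho_3 \Rightarrow \varrho_1 \sqsubseteq \varrho_3$. 
Let us assume that $\varrho_1 \sqsubseteq \varrho_2 \wedge \varrho_2 \sqsubseteq \varrho_3$. From $\varrho_i \sqsubseteq \varrho_{i+1}$ we have $j_i$ such that 
conditions (a)-(d) are fulfilled for $i = 1, 2$. Let us take $j$ to be the minimum of $j_1$ 
and $j_2$. Now we need to verify that conditions (a)-(d) hold for $j$. If $rank(R) < j$ 
we have $\varrho_1(R) = \varrho_2(R)$ and $\varrho_2(R) = \varrho_3(R)$. It follows that $\varrho_1(R) = \varrho_3(R)$, hence 
(a) holds. Now let us assume that $rank(R) = j$ and either $R$ is a defined relation 
or $j = 0$. We have $\varrho_1(R) \subseteq \varrho_2(R)$ and $\varrho_2(R) \subseteq \varrho_3(R)$ and from transitivity of 
$\subseteq$ we get $\varrho_1(R) \subseteq \varrho_3(R)$, which gives (b). Alternatively $rank(R) = j$ and $R$ is 
a constrained relation. We have $\varrho_1(R) \supseteq \varrho_2(R)$ and $\varrho_2(R) \supseteq \varrho_3(R)$ and from 
transitivity of $\supseteq$ we get $\varrho_1(R) \supseteq \varrho_3(R)$, thus (c) holds. Let us now assume that 
$j \neq s$, hence $\varrho_i(R) \neq \varrho_{i+1}(R)$ for some $R \in \mathcal{R}$ with $rank(R) = j$ and some $i \in \{1, 2\}$. Without loss of 
generality let us assume that $\varrho_1(R) \neq \varrho_2(R)$. In case $R$ is a defined relation we 
have $\varrho_1(R) \subsetneq \varrho_2(R)$ and $\varrho_2(R) \subseteq \varrho_3(R)$, hence $\varrho_1(R) \neq \varrho_3(R)$. Similarly in case 
$R$ is a constrained relation we have $\varrho_1(R) \supsetneq \varrho_2(R)$ and $\varrho_2(R) \supseteq \varrho_3(R)$. Hence 
$\varrho_1(R) \neq \varrho_3(R)$, and (d) holds. 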

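Anti-symmetry: $\forall \varrho_1, \varrho_2 \in \Delta : \varrho_1 \sqsubseteq \varrho_2 \wedge \varrho_2 \sqsubseteq \varrho_1 \Rightarrow \varrho_1 = \varrho_2$. 
Let us assume $\varrho_1 \sqsubseteq \varrho_2$ and $\varrho_2 \sqsubseteq \varrho_1$. Let $j$ be minimal such that $rank(R) = j$ 
and $\varrho_1(R) \neq \varrho_2(R)$ for some $R \in \mathcal{R}$. If $j = 0$ or $R$ is a defined relation, then 
we have $\varrho_1(R) \subseteq \varrho_2(R)$ and $\varrho_2(R) \subseteq \varrho_1(R)$. Hence $\varrho_1(R) = \varrho_2(R)$ which is a 
contradiction. Similarly if $R$ is a constrained relation we have $\varrho_1(R) \supseteq \varrho_2(R)$ and 
$\varrho_2(R) \supseteq \varrho_1(R)$. It follows that $\varrho_1(R) = \varrho_2(R)$, which again is a contradiction. 
Thus it must be the case that $\varrho_1(R) = \varrho_2(R)$ for all $R \in \mathcal{R}$. □ 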
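
The ordering used in this proof can be made concrete with a small executable check. The sketch below is only an illustration: it assumes interpretations are encoded as Python dictionaries from relation names to sets, with `rank` and `defined` as hypothetical lookup tables, and it tests conditions (a)-(d) for each candidate layer $j$ in turn.

```python
def leq(r1, r2, rank, defined, s):
    """Return True if r1 is below r2 in the layered ordering, i.e. some
    layer j in 0..s satisfies conditions (a)-(d) from the proof above."""
    for j in range(s + 1):
        ok = True
        differs = False
        for rel, k in rank.items():
            if k < j:
                ok = ok and r1[rel] == r2[rel]          # (a) equal below layer j
            elif k == j:
                if j == 0 or defined[rel]:
                    ok = ok and r1[rel] <= r2[rel]      # (b) subset for defined relations
                else:
                    ok = ok and r1[rel] >= r2[rel]      # (c) superset for constrained relations
                differs = differs or r1[rel] != r2[rel]
        if ok and (j == s or differs):                  # (d) j = s or a difference at layer j
            return True
    return False


# Hypothetical signature: E has rank 0, P is defined (rank 1), Q is constrained (rank 2).
rank = {"E": 0, "P": 1, "Q": 2}
defined = {"E": True, "P": True, "Q": False}
rho1 = {"E": {"a"}, "P": {"a"}, "Q": {"a", "b"}}
rho2 = {"E": {"a"}, "P": {"a"}, "Q": {"a"}}

below = leq(rho1, rho2, rank, defined, 2)
above = leq(rho2, rho1, rank, defined, 2)
reflexive = leq(rho1, rho1, rank, defined, 2)
```

Here `rho1` lies below `rho2` because the two agree on ranks 0 and 1 and the constrained relation `Q` is larger in `rho1`; the converse comparison fails, in line with anti-symmetry.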

B Proof of Lemma 2 

Proof. First we prove that $\sqcap M$ is a lower bound of $M$; that is $\sqcap M \sqsubseteq \varrho$ for all 
$\varrho \in M$. Let $j$ be maximum such that $\varrho \in M_j$; since $M = M_0$ and $M_j \supseteq M_{j+1}$ 
clearly such $j$ exists. From the definition of $M_j$ it follows that $(\sqcap M)(R) = \varrho(R)$ for 
all $R$ with $rank(R) < j$; hence (a) holds. 

If $rank(R) = j$ and either $R$ is a defined relation or $j = 0$ we have 
$(\sqcap M)(R) = \bigcap\{\varrho'(R) \mid \varrho' \in M_j\} \subseteq \varrho(R)$, showing that (b) holds. 

Similarly, if $R$ is a constrained relation with $rank(R) = j$ we have 
$(\sqcap M)(R) = \bigcup\{\varrho'(R) \mid \varrho' \in M_j\} \supseteq \varrho(R)$, showing that (c) holds. 

Finally let us assume that $j \neq s$; we need to show that there is some $R$ with 
$rank(R) = j$ such that $(\sqcap M)(R) \neq \varrho(R)$. Since we know that $j$ is maximum 
such that $\varrho \in M_j$, it follows that $\varrho \notin M_{j+1}$, hence there is a relation $R$ with 
$rank(R) = j$ such that $(\sqcap M)(R) \neq \varrho(R)$; thus (d) holds. 

Now we need to show that $\sqcap M$ is the greatest lower bound. Let us assume that 
$\varrho' \sqsubseteq \varrho$ for all $\varrho \in M$, and let us show that $\varrho' \sqsubseteq \sqcap M$. If $\varrho' = \sqcap M$ the result 
holds vacuously, hence let us assume $\varrho' \neq \sqcap M$. Then there exists a minimal $j$ 
such that $(\sqcap M)(R) \neq \varrho'(R)$ for some $R$ with $rank(R) = j$. Let us first consider 
$R$ such that $rank(R) < j$. By our choice of $j$ we have $(\sqcap M)(R) = \varrho'(R)$, hence 
(a) holds. 

Next assume that $rank(R) = j$ and either $R$ is a defined relation or $j = 0$. Then 
$\varrho' \sqsubseteq \varrho$ for all $\varrho \in M_j$. It follows that $\varrho'(R) \subseteq \varrho(R)$ for all $\varrho \in M_j$. Thus we 
have $\varrho'(R) \subseteq \bigcap\{\varrho(R) \mid \varrho \in M_j\}$. Since $(\sqcap M)(R) = \bigcap\{\varrho(R) \mid \varrho \in M_j\}$, we have 
$\varrho'(R) \subseteq (\sqcap M)(R)$, which proves (b). 

Now assume $rank(R) = j$ and $R$ is a constrained relation. We have that $\varrho' \sqsubseteq \varrho$ 
for all $\varrho \in M_j$. Since $R$ is a constrained relation it follows that $\varrho'(R) \supseteq \varrho(R)$ 
for all $\varrho \in M_j$. Thus we have $\varrho'(R) \supseteq \bigcup\{\varrho(R) \mid \varrho \in M_j\}$. Since 
$(\sqcap M)(R) = \bigcup\{\varrho(R) \mid \varrho \in M_j\}$, we have $\varrho'(R) \supseteq (\sqcap M)(R)$, which proves (c). 
Finally since we assumed that $(\sqcap M)(R) \neq \varrho'(R)$ for some $R$ with $rank(R) = j$, 
it follows that (d) holds. Thus we proved that $\varrho' \sqsubseteq \sqcap M$. □ 
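
The layer-by-layer construction of $\sqcap M$ used in this proof can be sketched in Python. This is an illustration under stated assumptions, not the paper's solver: interpretations are dictionaries from relation names to sets, `top` is a hypothetical table giving the full set of tuples for each relation (needed only as the starting point of the intersection), and $M$ is assumed non-empty.

```python
def glb(M, rank, defined, s, top):
    """Greatest lower bound of a non-empty list M of interpretations,
    computed one layer at a time as in the proof above."""
    result = {}
    Mj = list(M)                                   # M_0 = M
    for j in range(s + 1):
        layer = [rel for rel, k in rank.items() if k == j]
        for rel in layer:
            if j == 0 or defined[rel]:
                acc = set(top[rel])                # intersection over M_j
                for rho in Mj:
                    acc &= rho[rel]
            else:
                acc = set()                        # union over M_j
                for rho in Mj:
                    acc |= rho[rel]
            result[rel] = frozenset(acc)
        # M_{j+1}: members of M_j that agree with the result on layer j
        Mj = [rho for rho in Mj if all(rho[rel] == result[rel] for rel in layer)]
    return result


# Hypothetical example with a defined relation P of rank 1.
top = {"E": {"a", "b"}, "P": {"a", "b"}, "Q": {"a", "b"}}
m_defined = [{"E": {"a"}, "P": {"a", "b"}}, {"E": {"a", "b"}, "P": {"a"}}]
g1 = glb(m_defined, {"E": 0, "P": 1}, {"E": True, "P": True}, 1, top)

# Hypothetical example with a constrained relation Q of rank 1.
m_constrained = [{"E": {"a"}, "Q": {"a"}}, {"E": {"a"}, "Q": {"b"}}]
g2 = glb(m_constrained, {"E": 0, "Q": 1}, {"E": True, "Q": False}, 1, top)
```

In the first example only the first interpretation agrees with the result on layer 0, so it alone determines `P`; this reflects the lexicographic character of the ordering. In the second example both members survive to layer 1 and the constrained relation `Q` is their union.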

C Proof of Proposition 1 

In order to prove Proposition 1 we first state and prove two auxiliary lemmas. 

Definition 7. We introduce an ordering $\subseteq_{/j}$ defined by $\varrho_1 \subseteq_{/j} \varrho_2$ if and only if 

- $\forall R : rank(R) < j \Rightarrow \varrho_1(R) = \varrho_2(R)$ 

- $\forall R : rank(R) = j \Rightarrow \varrho_1(R) \subseteq \varrho_2(R)$ 

Lemma 3. Assume a condition $cond$ occurs in $cl_j$, and let $\varsigma$ be a valuation of 
free variables in $cond$. If $\varrho_1 \subseteq_{/j} \varrho_2$ and $(\varrho_1, \varsigma) \models cond$ then $(\varrho_2, \varsigma) \models cond$. 

Proof. We proceed by induction on $j$ and in each case perform a structural 
induction on the form of the condition $cond$ occurring in $cl_j$. 

Case: $cond = R(x)$ 
Assume $\varrho_1 \subseteq_{/j} \varrho_2$ and 

$$(\varrho_1, \varsigma) \models R(x)$$

From Table 1 it follows that 

$$[\![x]\!]([\,], \varsigma) \in \varrho_1(R)$$

Depending on the rank of $R$ we have two sub-cases. 

(1) Let $rank(R) < j$, then from Definition 7 we know that $\varrho_1(R) = \varrho_2(R)$ and 
hence 

$$[\![x]\!]([\,], \varsigma) \in \varrho_2(R)$$

which according to Table 1 is equivalent to 

$$(\varrho_2, \varsigma) \models R(x)$$

(2) Let us now assume $rank(R) = j$, then from Definition 7 we know that 
$\varrho_1(R) \subseteq \varrho_2(R)$ and hence 

$$[\![x]\!]([\,], \varsigma) \in \varrho_2(R)$$

which is equivalent to 

$$(\varrho_2, \varsigma) \models R(x)$$

and finishes the case. 
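
Case: $cond = \neg R(x)$ 
Assume $\varrho_1 \subseteq_{/j} \varrho_2$ and 

$$(\varrho_1, \varsigma) \models \neg R(x)$$

From Table 1 it follows that 

$$[\![x]\!]([\,], \varsigma) \notin \varrho_1(R)$$

Since $rank(R) < j$, then from Definition 7 we have $\varrho_1(R) = \varrho_2(R)$ and hence 

$$[\![x]\!]([\,], \varsigma) \notin \varrho_2(R)$$

which according to Table 1 is equivalent to 

$$(\varrho_2, \varsigma) \models \neg R(x)$$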
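
Case: $cond = cond_1 \wedge cond_2$ 
Assume $\varrho_1 \subseteq_{/j} \varrho_2$ and 

$$(\varrho_1, \varsigma) \models cond_1 \wedge cond_2$$

From Table 1 it follows that 

$$(\varrho_1, \varsigma) \models cond_1 \text{ and } (\varrho_1, \varsigma) \models cond_2$$

The induction hypothesis gives 

$$(\varrho_2, \varsigma) \models cond_1 \text{ and } (\varrho_2, \varsigma) \models cond_2$$

Hence we have 

$$(\varrho_2, \varsigma) \models cond_1 \wedge cond_2$$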

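Case: $cond = cond_1 \vee cond_2$ 
Assume $\varrho_1 \subseteq_{/j} \varrho_2$ and 

$$(\varrho_1, \varsigma) \models cond_1 \vee cond_2$$

From Table 1 it follows that 

$$(\varrho_1, \varsigma) \models cond_1 \text{ or } (\varrho_1, \varsigma) \models cond_2$$

The induction hypothesis gives 

$$(\varrho_2, \varsigma) \models cond_1 \text{ or } (\varrho_2, \varsigma) \models cond_2$$

Hence we have 

$$(\varrho_2, \varsigma) \models cond_1 \vee cond_2$$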

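Case: $cond = \exists x : cond'$ 
Assume $\varrho_1 \subseteq_{/j} \varrho_2$ and 

$$(\varrho_1, \varsigma) \models \exists x : cond'$$

From Table 1 it follows that 

$$\exists a \in \mathcal{U} : (\varrho_1, \varsigma[x \mapsto a]) \models cond'$$

The induction hypothesis gives 

$$\exists a \in \mathcal{U} : (\varrho_2, \varsigma[x \mapsto a]) \models cond'$$

Hence from Table 1 we have 

$$(\varrho_2, \varsigma) \models \exists x : cond'$$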

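Case: $cond = \forall x : cond'$ 
Assume $\varrho_1 \subseteq_{/j} \varrho_2$ and 

$$(\varrho_1, \varsigma) \models \forall x : cond'$$

From Table 1 it follows that 

$$\forall a \in \mathcal{U} : (\varrho_1, \varsigma[x \mapsto a]) \models cond'$$

The induction hypothesis gives 

$$\forall a \in \mathcal{U} : (\varrho_2, \varsigma[x \mapsto a]) \models cond'$$

Hence from Table 1 we have 

$$(\varrho_2, \varsigma) \models \forall x : cond'$$

□ 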

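Lemma 4. If $\varrho = \sqcap M$ and $(\varrho', \zeta, \varsigma) \models cl_j$ for all $\varrho' \in M$ then $(\varrho, \zeta, \varsigma) \models cl_j$. 

Proof. We proceed by induction on $j$ and in each case perform a structural 
induction on the form of the clause $cl$ occurring in $cl_j$. 

Case: $cl_j = \mathbf{define}(cond \Rightarrow R(u))$ 
Assume 

$$\forall \varrho' \in M : (\varrho', \zeta, \varsigma) \models cond \Rightarrow R(u) \qquad (1)$$

Let us also assume 

$$(\varrho, \varsigma) \models cond$$

Since $\varrho = \sqcap M$ we know that 

$$\forall \varrho' \in M : \varrho \sqsubseteq \varrho' \qquad (2)$$

Let $R'$ occur in $cond$. We have two possibilities; either $rank(R') = j$ and $R'$ is 
a defined relation, then from (2) it follows that $\varrho(R') \subseteq \varrho'(R')$. Alternatively 
$rank(R') < j$ and from (2) it follows that $\varrho(R') = \varrho'(R')$. Hence from Definition 
7 we have that $\varrho \subseteq_{/j} \varrho'$. Thus from Lemma 3 it follows that 

$$\forall \varrho' \in M : (\varrho', \varsigma) \models cond$$

Hence from (1) we have 

$$\forall \varrho' \in M : (\varrho', \zeta, \varsigma) \models R(u)$$

which from Table 1 is equivalent to 

$$\forall \varrho' \in M : [\![u]\!](\zeta, \varsigma) \in \varrho'(R)$$

It follows that 

$$[\![u]\!](\zeta, \varsigma) \in \bigcap\{\varrho'(R) \mid \varrho' \in M\} = \varrho(R)$$

which from Table 1 is equivalent to 

$$(\varrho, \zeta, \varsigma) \models R(u)$$

and finishes the case. 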

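Case: $cl_j = \mathbf{define}(def_1 \wedge def_2)$ 

Assume 

$$\forall \varrho' \in M : (\varrho', \zeta, \varsigma) \models def_1 \wedge def_2$$

From Table 1 we have that for all $\varrho' \in M$ 

$$(\varrho', \zeta, \varsigma) \models def_1 \text{ and } (\varrho', \zeta, \varsigma) \models def_2$$

The induction hypothesis gives 

$$(\varrho, \zeta, \varsigma) \models def_1 \text{ and } (\varrho, \zeta, \varsigma) \models def_2$$

Hence from Table 1 we have 

$$(\varrho, \zeta, \varsigma) \models def_1 \wedge def_2$$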

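Case: $cl_j = \mathbf{define}(\forall x : def)$ 
Assume 

$$\forall \varrho' \in M : (\varrho', \zeta, \varsigma) \models \forall x : def \qquad (3)$$

From Table 1 we have that 

$$\forall \varrho' \in M : \forall a \in \mathcal{U} : (\varrho', \zeta, \varsigma[x \mapsto a]) \models def$$

Thus 

$$\forall a \in \mathcal{U} : \forall \varrho' \in M : (\varrho', \zeta, \varsigma[x \mapsto a]) \models def$$

The induction hypothesis gives 

$$\forall a \in \mathcal{U} : (\varrho, \zeta, \varsigma[x \mapsto a]) \models def$$

Hence from Table 1 we have 

$$(\varrho, \zeta, \varsigma) \models \forall x : def$$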

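Case: $cl_j = \mathbf{constrain}(R(u) \Rightarrow cond)$ 
Assume 

$$\forall \varrho' \in M : (\varrho', \zeta, \varsigma) \models R(u) \Rightarrow cond \qquad (4)$$

Let us also assume 

$$(\varrho, \zeta, \varsigma) \models R(u)$$

From Table 1 it follows that 

$$[\![u]\!](\zeta, \varsigma) \in \bigcup\{\varrho'(R) \mid \varrho' \in M\}$$

Thus there is some $\varrho' \in M$ such that 

$$[\![u]\!](\zeta, \varsigma) \in \varrho'(R)$$

From (4) it follows that 

$$(\varrho', \varsigma) \models cond$$

Since $\varrho = \sqcap M$ we know that 

$$\forall \varrho' \in M : \varrho \sqsubseteq \varrho' \qquad (5)$$

Let $R'$ occur in $cond$. We have two possibilities; either $rank(R') = j$ and $R'$ is a 
constrained relation, then from (5) it follows that $\varrho(R') \supseteq \varrho'(R')$. Alternatively 
$rank(R') < j$ and from (5) it follows that $\varrho(R') = \varrho'(R')$. Hence from Definition 
7 we have that $\varrho' \subseteq_{/j} \varrho$. Thus from Lemma 3 it follows that 

$$(\varrho, \varsigma) \models cond$$

which finishes the case. 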

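Case: $cl_j = \mathbf{constrain}(con_1 \wedge con_2)$ 

Assume 

$$\forall \varrho' \in M : (\varrho', \zeta, \varsigma) \models con_1 \wedge con_2$$

From Table 1 we have that for all $\varrho' \in M$ 

$$(\varrho', \zeta, \varsigma) \models con_1 \text{ and } (\varrho', \zeta, \varsigma) \models con_2$$

The induction hypothesis gives 

$$(\varrho, \zeta, \varsigma) \models con_1 \text{ and } (\varrho, \zeta, \varsigma) \models con_2$$

Hence from Table 1 we have 

$$(\varrho, \zeta, \varsigma) \models con_1 \wedge con_2$$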

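Case: $cl_j = \mathbf{constrain}(\forall x : con)$ 
Assume 

$$\forall \varrho' \in M : (\varrho', \zeta, \varsigma) \models \forall x : con \qquad (6)$$

From Table 1 we have that 

$$\forall \varrho' \in M : \forall a \in \mathcal{U} : (\varrho', \zeta, \varsigma[x \mapsto a]) \models con$$

Thus 

$$\forall a \in \mathcal{U} : \forall \varrho' \in M : (\varrho', \zeta, \varsigma[x \mapsto a]) \models con$$

The induction hypothesis gives 

$$\forall a \in \mathcal{U} : (\varrho, \zeta, \varsigma[x \mapsto a]) \models con$$

Hence from Table 1 we have 

$$(\varrho, \zeta, \varsigma) \models \forall x : con$$

□ 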

Proposition 1. Assume $cls$ is a stratified LFP formula, and $\varsigma_0$ and $\zeta_0$ are interpretations 
of the free variables and function symbols in $cls$, respectively. Furthermore, 
$\varrho_0$ is an interpretation of all relations of rank 0. Then 
$\{\varrho \mid (\varrho, \zeta_0, \varsigma_0) \models cls \wedge \forall R : rank(R) = 0 \Rightarrow \varrho(R) \supseteq \varrho_0(R)\}$ is a Moore family. 

Proof. The result follows from Lemma 4. □ 



D Proof of Proposition 2 

Proposition 2. For a finite universe $\mathcal{U}$, the best solution $\varrho$ such that $\varrho_0 \sqsubseteq \varrho$ of 
an LFP formula $cl_1, \ldots, cl_s$ (w.r.t. an interpretation of the constant symbols) can 
be computed in time 

$$O\Big(|\varrho_0| + \sum_{1 \le i \le s} |cl_i|\,|\mathcal{U}|^{k_i}\Big)$$

where $k_i$ is the maximal nesting depth of quantifiers in $cl_i$ and $|\varrho_0|$ is the 
sum of cardinalities of predicates $\varrho_0(R)$ of rank 0. We also assume unit time 
hash table operations (as in [19]). 

Proof. Let $cl_i$ be a clause corresponding to the $i$-th layer. Since $cl_i$ can be either 
a define clause or a constrain clause, we have two cases. 

Let us first assume that $cl_i = \mathbf{define}(def)$; the proof proceeds in three phases. 
First we transform $def$ to $def'$ by replacing every universal quantification $\forall x : def_1$ 
by the conjunction of all $|\mathcal{U}|$ possible instantiations of $def_1$, every existential 
quantification $\exists x : cond$ by the disjunction of all $|\mathcal{U}|$ possible instantiations of 
$cond$ and every universal quantification $\forall x : cond$ by the conjunction of all $|\mathcal{U}|$ 
possible instantiations of $cond$. The resulting clause $def'$ is logically equivalent 
to $def$ and has size 

$$O(|\mathcal{U}|^{k}\,|def|) \qquad (7)$$

where $k$ is the maximal nesting depth of quantifiers in $def$. Furthermore, $def'$ is 
boolean, which means that there are no variables or quantifiers and all literals 
are viewed as nullary predicates. 
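
The first phase can be illustrated by a small Python sketch. The encoding of formulas as nested tuples and the names `substitute`, `ground` and `size` are assumptions made for this example only; variable names are assumed to be distinct from universe elements, and the universe is assumed non-empty.

```python
from functools import reduce


def substitute(f, x, a):
    """Replace free occurrences of variable x in formula f by the constant a."""
    tag = f[0]
    if tag == "pred":
        return ("pred", f[1], tuple(a if t == x else t for t in f[2]))
    if tag in ("forall", "exists"):
        # An inner quantifier over the same variable shadows x.
        return f if f[1] == x else (tag, f[1], substitute(f[2], x, a))
    return (tag,) + tuple(substitute(g, x, a) for g in f[1:])


def ground(f, universe):
    """Expand forall into a conjunction and exists into a disjunction of instances."""
    tag = f[0]
    if tag == "pred":
        return f
    if tag in ("forall", "exists"):
        op = "and" if tag == "forall" else "or"
        insts = [ground(substitute(f[2], f[1], a), universe) for a in universe]
        return reduce(lambda left, right: (op, left, right), insts)
    return (tag,) + tuple(ground(g, universe) for g in f[1:])


def size(f):
    """Number of predicate and operator occurrences in f."""
    if f[0] == "pred":
        return 1
    if f[0] in ("forall", "exists"):
        return 1 + size(f[2])
    return 1 + sum(size(g) for g in f[1:])


# forall x : forall y : E(x, y) => P(x), over a three-element universe.
clause = ("forall", "x", ("forall", "y",
          ("imp", ("pred", "E", ("x", "y")), ("pred", "P", ("x",)))))
universe = ["a", "b", "c"]
grounded = ground(clause, universe)
original_size, grounded_size = size(clause), size(grounded)
```

For this clause the nesting depth is $k = 2$, the original size is 5 and the grounded formula has size 35, which stays below $|\mathcal{U}|^{k} \cdot |def| = 9 \cdot 5 = 45$, in line with bound (7).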

In the second phase we transform the formula $def'$, being the result of the 
first phase, into a sequence of formulas $def'' = def'_1, \ldots, def'_l$ as follows. We first 
replace all top-level conjunctions in $def'$ with ",". Then we successively replace 
each formula by a sequence of simpler ones using the following rewrite rule 

$$cond_1 \vee cond_2 \Rightarrow R(u) \;\;\mapsto\;\; cond_1 \Rightarrow Q_{new},\;\; cond_2 \Rightarrow Q_{new},\;\; Q_{new} \Rightarrow R(u)$$

where $Q_{new}$ is a fresh nullary predicate that is generated for each application 
of the rule. The transformation is completed as soon as no replacement can be 
done. The conjunction of the resulting define clauses is logically equivalent to 
$def'$. 

To show that this process terminates and that the size of $def''$ is at most 
a constant times the size of the input formula $def'$, we assign a cost to the 
formulae. Let us define the cost of a sequence of clauses as the sum of costs of 
all occurrences of predicate symbols and operators (excluding ","). In general, 
the cost of a symbol or operator is 1, except disjunction, which counts 6. Then the 
above rule decreases the cost from $k + 7$ to $k + 6$, for a suitable value of $k$. Since 
the cost of the initial sequence is at most 6 times the size of $def'$, only a linear 
number of rewrite steps can be performed. Since each step increases the size at 
most by a constant, we conclude that $def''$ has increased just by a constant 
factor. Consequently, when applying this transformation to $def'$ we obtain a 
boolean formula without sharing of size as in (7). 
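
The cost argument can be checked on a small instance. The following Python sketch implements only the single rewrite rule stated above, on a hypothetical tuple encoding of boolean clauses; the names `cost` and `rewrite` and the fresh predicate names `Q1`, `Q2`, ... are assumptions of this example.

```python
from itertools import count


def cost(f):
    """Cost of a formula: 1 per predicate or operator, except 6 per disjunction."""
    if f[0] == "pred":
        return 1
    return (6 if f[0] == "or" else 1) + sum(cost(g) for g in f[1:])


def rewrite(clauses):
    """Apply the rule  c1 or c2 => R  ->  c1 => Q, c2 => Q, Q => R  until it no
    longer applies; returns the resulting clauses and the number of steps."""
    fresh = (f"Q{i}" for i in count(1))
    todo, done, steps = list(clauses), [], 0
    while todo:
        clause = todo.pop()
        cond, head = clause[1], clause[2]
        if cond[0] == "or":
            q = ("pred", next(fresh))            # fresh nullary predicate
            todo += [("imp", cond[1], q), ("imp", cond[2], q), ("imp", q, head)]
            steps += 1
        else:
            done.append(clause)
    return done, steps


A, B, C, R = (("pred", n) for n in "ABCR")
clause = ("imp", ("or", ("or", A, B), C), R)     # (A or B) or C => R
cost_before = cost(clause)
done, steps = rewrite([clause])
cost_after = sum(cost(c) for c in done)
```

Each application removes one disjunction (cost 6) and adds two implications and three occurrences of the fresh predicate (cost 5), so the total cost drops by exactly one per step: here from 17 to 15 in two steps, yielding five clauses.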

The third phase solves the system that is a result of phase two, which can 
be done in linear time by the classical techniques of e.g. [10]. 
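
To indicate why a linear bound is plausible for the third phase, the sketch below computes the least model of a set of propositional Horn clauses by counter-based propagation. It is a generic illustration and only assumed to be in the spirit of the classical techniques referred to above; it handles positive bodies only, and the encoding of clauses as (body, head) pairs is hypothetical.

```python
from collections import defaultdict, deque


def least_model(clauses, facts):
    """clauses: list of (body, head) with body a list of atoms read conjunctively.
    facts: atoms known to hold initially. Returns the set of derivable atoms."""
    remaining = [len(set(body)) for body, _ in clauses]   # unsatisfied body atoms
    watch = defaultdict(list)                             # atom -> clauses using it
    for i, (body, _) in enumerate(clauses):
        for atom in set(body):
            watch[atom].append(i)

    true = set()
    queue = deque(facts)
    for i, (_, head) in enumerate(clauses):
        if remaining[i] == 0:                             # clause with empty body
            queue.append(head)

    while queue:
        atom = queue.popleft()
        if atom in true:
            continue
        true.add(atom)
        for i in watch[atom]:                             # each occurrence visited once
            remaining[i] -= 1
            if remaining[i] == 0:
                queue.append(clauses[i][1])
    return true


# Clauses as produced by the rewrite rule, plus one conjunctive body.
clauses = [(["A"], "Q1"), (["B"], "Q1"), (["Q1"], "R"), (["R", "C"], "S")]
model = least_model(clauses, {"B"})
```

Every atom is added to the model at most once and every body occurrence is decremented at most once, so the work is proportional to the total size of the clauses, assuming unit time hash table operations.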

Let us now assume that $cl_i = \mathbf{constrain}(con)$. We begin by transforming 
$con$ into a logically equivalent (modulo fresh predicates) define clause. The 
transformation is done by function $f_i$ defined as 

$$f_i(\mathbf{constrain}(con)) = \mathbf{define}(g(con)),\; \mathbf{define}(h_i(con))$$

$$\begin{array}{lcl}
g(\forall x : con) & = & \forall x : g(con) \\
g(con_1 \wedge con_2) & = & g(con_1) \wedge g(con_2) \\
g(R(u) \Rightarrow cond) & = & (\neg cond[R^{\complement}(u)/\neg R(u)] \Rightarrow R^{\complement}(u))
\end{array}$$

$$\begin{array}{lcl}
h_i(\forall x : con) & = & \forall x : h_i(con) \\
h_i(con_1 \wedge con_2) & = & h_i(con_1) \wedge h_i(con_2) \\
h_i(R(u) \Rightarrow cond) & = & \mathbf{let}\; cond' = cond[true / \{R'(v) \mid rank(R') = i\}]\; \mathbf{in} \\
 & & cond' \wedge \neg R^{\complement}(u) \Rightarrow R(u)
\end{array}$$

where $R^{\complement}$ is a new predicate corresponding to the complement of $R$. The size of 
the formula increases by a number of constrained predicates; hence the size of the 
input formula is increased by a constant factor. Then the proof proceeds as in 
the case of a define clause. 

The three phases of the transformation result in a sequence of define clauses 
of size 

$$O\Big(\sum_{1 \le i \le s} |cl_i|\,|\mathcal{U}|^{k_i}\Big)$$

which can then be solved in linear time. We also need to take into account the 
size of the initial knowledge, i.e. the cardinality of all predicates of rank 0; thus 
the overall worst case complexity is 

$$O\Big(|\varrho_0| + \sum_{1 \le i \le s} |cl_i|\,|\mathcal{U}|^{k_i}\Big)$$

□ 



